Inexpensive Computation of the Inverse of the Genomic Relationship Matrix in Populations with Small Effective Population Size

Author

  • Ignacy Misztal
Abstract

Many computations with SNP data including genomic evaluation, parameter estimation, and genome-wide association studies use an inverse of the genomic relationship matrix. The cost of a regular inversion is cubic and is prohibitively expensive for large matrices. Recent studies in cattle demonstrated that the inverse can be computed in almost linear time by recursion on any subset of ∼10,000 individuals. The purpose of this study is to present a theory of why such a recursion works and its implication for other populations. Assume that, because of a small effective population size, the additive information in a genotyped population has a small dimensionality, even with a very large number of SNP markers. That dimensionality is visible as a limited number of effective SNP effects, independent chromosome segments, or the rank of the genomic relationship matrix. Decompose a population arbitrarily into core and noncore individuals, with the number of core individuals equal to that dimensionality. Then, breeding values of noncore individuals can be derived by recursions on breeding values of core individuals, with coefficients of the recursion computed from the genomic relationship matrix. A resulting algorithm for the inversion called "algorithm for proven and young" (APY) has a linear computing and memory cost for noncore animals. Noninfinitesimal genetic architecture can be accommodated through a trait-specific genomic relationship matrix, possibly derived from Bayesian regressions. For populations with small effective population size, the inverse of the genomic relationship matrix can be computed inexpensively for a very large number of genotyped individuals.
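The core/noncore recursion described above can be illustrated with a short sketch. This is a minimal NumPy illustration, not the author's implementation: the function name `apy_inverse` and its arguments are hypothetical, and it uses the block form of the APY inverse commonly given in the literature, in which the residual (Mendelian-sampling-like) covariance of noncore individuals, conditional on core individuals, is treated as diagonal. It assumes the genomic relationship matrix `G` is positive definite (for example, after blending), so that the core block is invertible and the residual variances are positive.

```python
import numpy as np

def apy_inverse(G, core):
    """Sketch of an APY-style inverse of a genomic relationship matrix.

    G    : (n, n) symmetric positive definite array (assumed).
    core : indices of core individuals; all others are noncore.
    """
    n = G.shape[0]
    core = np.asarray(core)
    noncore = np.setdiff1d(np.arange(n), core)

    Gcc = G[np.ix_(core, core)]
    Gcn = G[np.ix_(core, noncore)]

    # Recursion coefficients: each column regresses one noncore
    # individual on the core individuals (Gcc^{-1} Gcn).
    P = np.linalg.solve(Gcc, Gcn)

    # Residual variance of each noncore individual given the core:
    # g_ii - g_ic Gcc^{-1} g_ci. Only this diagonal is computed, which
    # is why the cost per noncore individual does not grow with the
    # number of noncore individuals.
    m_diag = G[noncore, noncore] - np.einsum("ij,ij->j", Gcn, P)

    # A dense n x n array is used here for readability only; a real
    # implementation would keep the noncore block as a diagonal.
    Ginv = np.zeros((n, n))
    Ginv[np.ix_(core, core)] = np.linalg.inv(Gcc) + (P / m_diag) @ P.T
    Ginv[np.ix_(core, noncore)] = -P / m_diag
    Ginv[np.ix_(noncore, core)] = (-P / m_diag).T
    Ginv[noncore, noncore] = 1.0 / m_diag
    return Ginv
```

With a single noncore individual the diagonal approximation is exact, so the result should agree with a regular inverse; with more noncore individuals it is an approximation whose quality depends on how well the core set captures the dimensionality of `G`.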

Related resources

Comparing Different Marker Densities and Various Reference Populations Using Pedigree-Marker Best Linear Unbiased Prediction (BLUP) Model

For genomic selection to be applied successfully, the reference population and marker density should be chosen properly. The purpose of this study was to investigate the accuracy of genomic estimated breeding values at low (5K), intermediate (50K), and high (777K) marker densities in simulated populations under different scenarios for selecting the reference population. Af...

Effect of Reference Population Size and Imputation Methods on the Accuracy of Imputation in Pure and Mixed Populations

Imputation, a method of converting low-density chip genotypes to high-density ones, has been introduced to increase the accuracy of genomic selection in animals. In the current study, to investigate imputation accuracy, three populations, mixed (scenario 1), pure (scenario 2), and mixed + pure (scenario 3), were simulated using QMSim. Two imputation methods, Beagle and FImpute, were used fo...

Estimating the Accuracy of Genomic Selection in Small Genetic Populations: A Simulation Study

In the present study, two genetically connected populations, one small and one large, were simulated, and the effect of different sources of information from foreign populations on the accuracy of predicted genomic breeding values of young animals in the small population was investigated. The large population consisted of 200,000 animals over 15 generations, and the small population consisted of 5,000 animals over 3 ...

The Dimensionality of Genomic Information and Its Effect on Genomic Prediction.

The genomic relationship matrix (GRM) can be inverted by the algorithm for proven and young (APY) based on recursion on a random subset of animals. While a regular inverse has a cubic cost, the cost of the APY inverse can be close to linear. Theory for the APY assumes that the optimal size of the subset (maximizing accuracy of genomic predictions) is due to a limited dimensionality of the GRM, ...

Effect of Markers Effect Estimation Methods, Population Structure and Trait Architecture on the Accuracy of Genomic Breeding Values

This study aimed to investigate the effect of the marker effect estimation method, QTL distribution, number of QTLs, effective population size, and trait heritability on the accuracy of genomic predictions. Two effective population sizes, 100 and 500 individuals, were simulated with QMSim software. A 100 cM genome consisting of one chromosome was simulated where 500 SNPs and two diffe...

Journal title:

Volume 202  Issue 

Pages  -

Publication date 2016